Is writing style predictive of scientific fraud?
Abstract
The problem of detecting scientific fraud using machine learning was recently introduced, with initial positive results from a model that takes various general indicators into account. Those results seem to suggest that writing style is predictive of scientific fraud. We revisit these initial experiments and show that the leave-one-out testing procedure they used likely leads to a slight overestimate of predictability, but also that simple models can outperform the proposed model by some margin. We go on to explore more abstract linguistic features, such as linguistic complexity and discourse structure, only to obtain negative results. Analyzing our models nevertheless reveals some interesting patterns: fraudulent scientific writing, for example, contains less comparison, as well as different types of hedging and different ways of presenting logical reasoning.
Similar resources
How to improve English articles writing methods
Introduction: Today, English and its use as an international language is agreed upon by all, and a wealth of scientific articles written every day in the whole world is in English. The purpose of this study is to project some commonly used grammatical points that native speakers of Persian, unfortunately, do not follow or do not have adequate knowledge to use when writing in English, but this s...
Stylometry-based Fraud and Plagiarism Detection for Learning at Scale
Fraud detection in free and natural text submissions is a major challenge for educators in general. It is even more challenging to detect plagiarism at scale and in online classes such as Massive Open Online Courses. In this paper, we introduce a novel method that analyses the writing style of an author (stylometry) to identify plagiarism. We will show that our system scales to thousands of sub...
The study and recognition of artistic dyes in the Islamic period of Iran in writing and painting (Based on poetry of Khorasanid style poets)
The main features of Iranian painting in the post-Islamic centuries are the association with Persian literature. Persian literature and Persian art have intrinsic links, since the artist and poet are based on the unit's vision, rooted in a culture and intellectual space, to create. The result of this poet's creation is a literary work, and this work can have all the features of the work of art....
Exploring Stylistic Variation with Age and Income on Twitter
Writing style allows NLP tools to adjust to the traits of an author. In this paper, we explore the relation between stylistic and syntactic features and authors’ age and income. We confirm our hypothesis that for numerous feature types writing style is predictive of income even beyond age. We analyze the predictive power of writing style features in a regression task on two data sets of around ...
Linguistic Traces of a Scientific Fraud: The Case of Diederik Stapel
When scientists report false data, does their writing style reflect their deception? In this study, we investigated the linguistic patterns of fraudulent (N = 24; 170,008 words) and genuine publications (N = 25; 189,705 words) first-authored by social psychologist Diederik Stapel. The analysis revealed that Stapel's fraudulent papers contained linguistic changes in science-related discourse...
Journal title:
- CoRR
Volume: abs/1707.04095
Publication date: 2017